Interpretable Adversarial Training for Text
Generating high-quality and interpretable adversarial examples in the text
domain is a much more daunting task than it is in the image domain. This is due
partly to the discrete nature of text, partly to the problem of ensuring that
the adversarial examples are still probable and interpretable, and partly to
the problem of maintaining label invariance under input perturbations. In order
to address some of these challenges, we introduce sparse projected gradient
descent (SPGD), a new approach to crafting interpretable adversarial examples
for text. SPGD imposes a directional regularization constraint on input
perturbations by projecting them onto the directions to nearby word embeddings
with highest cosine similarities. This constraint ensures that perturbations
move each word embedding in an interpretable direction (i.e., towards another
nearby word embedding). Moreover, SPGD imposes a sparsity constraint on
perturbations at the sentence level by ignoring word-embedding perturbations
whose norms are below a certain threshold. This constraint ensures that our
method changes only a few words per sequence, leading to higher quality
adversarial examples. Our experiments with the IMDB movie review dataset show
that the proposed SPGD method improves adversarial example interpretability and
likelihood (evaluated by average per-word perplexity) compared to
state-of-the-art methods, while suffering little to no loss in training
performance.
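To make the two constraints concrete, here is a minimal NumPy sketch of one SPGD-style step, assuming a precomputed embedding table; the neighborhood size `k`, threshold `tau`, and step size `lr` are hypothetical stand-ins for the paper's actual hyperparameters.

```python
import numpy as np

def spgd_step(emb, grad, vocab_emb, k=10, tau=0.1, lr=1.0):
    """One sparse projected gradient step on a sequence of word embeddings.

    emb:       (seq_len, dim) current embeddings of the input sequence
    grad:      (seq_len, dim) gradient of the adversarial loss w.r.t. emb
    vocab_emb: (vocab, dim)   full embedding table to project onto
    """
    pert = lr * grad
    proj = np.zeros_like(pert)
    for i in range(len(emb)):
        # k nearest vocabulary embeddings to the current word embedding.
        dists = np.linalg.norm(vocab_emb - emb[i], axis=1)
        nearest = np.argsort(dists)[1:k + 1]          # skip the word itself
        dirs = vocab_emb[nearest] - emb[i]            # directions to neighbors
        dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
        # Project the perturbation onto the most cosine-aligned direction.
        cos = dirs @ pert[i] / (np.linalg.norm(pert[i]) + 1e-12)
        d = dirs[np.argmax(cos)]
        proj[i] = (pert[i] @ d) * d
    # Sentence-level sparsity: zero out perturbations below the threshold.
    keep = np.linalg.norm(proj, axis=1) >= tau
    return emb + proj * keep[:, None]
```

In a full attack loop this step would presumably be iterated, with the perturbed embeddings eventually snapped to their nearest vocabulary entries to produce readable text.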
A Power Efficient Sensing/Communication Scheme: Joint Source-Channel-Network Coding by Using Compressive Sensing
We propose a joint source-channel-network coding scheme, based on compressive
sensing principles, for wireless networks with AWGN channels (that may include
multiple access and broadcast), with sources exhibiting temporal and spatial
dependencies. Our goal is to provide a reconstruction of sources within an
allowed distortion level at each receiver. We perform joint source-channel
coding at each source by randomly projecting source values to a lower
dimensional space. We consider sources that satisfy the restricted eigenvalue
(RE) condition as well as more general sources for which the randomness of the
network allows a mapping to lower dimensional spaces. Our approach relies on
using analog random linear network coding. The receiver uses compressive
sensing decoders to reconstruct sources. Our key insight is that compressive
sensing and analog network coding both preserve the source characteristics
required for compressive sensing decoding.
Comment: Presented at Allerton Conference 201
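The pipeline can be illustrated end to end with a toy NumPy sketch, assuming an exactly sparse source and an ISTA decoder as the compressive sensing reconstruction; the dimensions and noise level are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 256, 80, 8                       # source dim, projection dim, sparsity

# Sparse source (stand-in for a temporally/spatially dependent source).
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)

A = rng.normal(size=(m, n)) / np.sqrt(m)   # random source-channel projection
B = rng.normal(size=(m, m)) / np.sqrt(m)   # analog random linear network coding
y = B @ (A @ x) + 0.01 * rng.normal(size=m)  # AWGN channel

# CS decoding at the receiver by ISTA (proximal gradient for the Lasso).
Phi = B @ A                                # effective sensing matrix
lam, L = 0.01, np.linalg.norm(Phi, 2) ** 2
z = np.zeros(n)
for _ in range(500):
    g = z - (Phi.T @ (Phi @ z - y)) / L
    z = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)

print("relative reconstruction error:",
      np.linalg.norm(z - x) / np.linalg.norm(x))
```

The point of the sketch is the key insight above: the composition of the random projection and the random network mixing is itself a valid sensing matrix, so the receiver can decode directly from the network output.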
Certifiably Robust Interpretation in Deep Learning
Deep learning interpretation is essential to explain the reasoning behind
model predictions. Understanding the robustness of interpretation methods is
important especially in sensitive domains such as medical applications since
interpretation results are often used in downstream tasks. Although
gradient-based saliency maps are popular methods for deep learning
interpretation, recent works show that they can be vulnerable to adversarial
attacks. In this paper, we address this problem and provide a certifiable
defense method for deep learning interpretation. We show that a sparsified
version of the popular SmoothGrad method, which computes the average saliency
maps over random perturbations of the input, is certifiably robust against
adversarial perturbations. We obtain this result by extending recent bounds for
certifiably robust smooth classifiers to the interpretation setting.
Experiments on ImageNet samples validate our theory.
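A minimal PyTorch sketch of the sparsified SmoothGrad idea follows; the top-k magnitude rule used for sparsification here is an assumption, and `n`, `sigma`, and `top_k` are illustrative hyperparameters rather than the paper's settings.

```python
import torch

def sparsified_smoothgrad(model, x, label, n=64, sigma=0.25, top_k=0.05):
    """Average saliency over Gaussian perturbations of the input, then
    keep only the largest-magnitude entries (a sketch)."""
    sal = torch.zeros_like(x)
    for _ in range(n):
        # SmoothGrad: gradient of the target logit at a noisy input.
        xp = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        model(xp.unsqueeze(0))[0, label].backward()
        sal += xp.grad.abs()
    sal /= n
    # Sparsify: zero out everything below the top-k magnitude threshold.
    k = max(1, int(top_k * sal.numel()))
    thresh = sal.flatten().topk(k).values.min()
    return sal * (sal >= thresh)
```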
Maximum Likelihood Latent Space Embedding of Logistic Random Dot Product Graphs
A latent space model for a family of random graphs assigns real-valued
vectors to nodes of the graph such that edge probabilities are determined by
latent positions. Latent space models provide a natural statistical framework
for graph visualization and clustering. A latent space model of particular
interest is the Random Dot Product Graph (RDPG), which can be fit using an
efficient spectral method; however, this method is based on a heuristic that
can fail, even in simple cases. Here, we consider a closely related latent
space model, the Logistic RDPG, which uses a logistic link function to map from
latent positions to edge likelihoods. For this model, we show that
asymptotically exact maximum likelihood inference of latent position vectors
can be achieved using an efficient spectral method. Our method involves
computing top eigenvectors of a normalized adjacency matrix and scaling
eigenvectors using a regression step. The novel regression scaling step is an
essential part of the proposed method. In simulations, we show that our
proposed method is more accurate and more robust than common practices. We also
demonstrate the effectiveness of our approach on two standard real networks:
the karate club and the political blogs networks.
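The following NumPy sketch outlines the two-stage procedure, assuming simple mean-centering as the adjacency normalization and a least-squares fit of per-eigenvector scales as the regression step; both are plausible placeholders rather than the paper's exact recipe.

```python
import numpy as np

def logistic_rdpg_fit(A, d):
    """Spectral fit of latent positions for a Logistic RDPG (sketch).

    A: (n, n) symmetric 0/1 adjacency matrix; d: latent dimension.
    """
    n = A.shape[0]
    A_norm = A - A.mean()                  # crude centering (an assumption)
    vals, vecs = np.linalg.eigh(A_norm)
    U = vecs[:, np.argsort(-vals)[:d]]     # top-d eigenvectors
    # Regression step: scale each eigenvector so the rank-d model best
    # fits the observed edge indicators in a least-squares sense.
    pairs = np.triu_indices(n, 1)
    feats = np.stack([np.outer(U[:, i], U[:, i])[pairs] for i in range(d)],
                     axis=1)
    scale, *_ = np.linalg.lstsq(feats, A[pairs], rcond=None)
    return U * np.sqrt(np.maximum(scale, 0.0))   # (n, d) latent positions
```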
Normalized Wasserstein Distance for Mixture Distributions with Applications in Adversarial Learning and Domain Adaptation
Understanding proper distance measures between distributions is at the core
of several learning tasks such as generative models, domain adaptation,
clustering, etc. In this work, we focus on mixture distributions that arise
naturally in several application domains where the data contains different
sub-populations. For mixture distributions, established distance measures such
as the Wasserstein distance do not take into account imbalanced mixture
proportions. Thus, even if two mixture distributions have identical mixture
components but different mixture proportions, the Wasserstein distance between
them will be large. This often leads to undesired results in distance-based
learning methods for mixture distributions. In this paper, we resolve this
issue by introducing the Normalized Wasserstein measure. The key idea is to
introduce mixture proportions as optimization variables, effectively
normalizing mixture proportions in the Wasserstein formulation. Using the
proposed normalized Wasserstein measure leads to significant performance gains
for mixture distributions with imbalanced mixture proportions compared to the
vanilla Wasserstein distance. We demonstrate the effectiveness of the proposed
measure in GANs, domain adaptation and adversarial clustering in several
benchmark datasets.
Comment: Accepted at ICCV 201
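In symbols, with shared mixture components P_{G_k} and proportions treated as optimization variables, the measure takes roughly the following form (our notation, reconstructed from the description above):

```latex
W_N(P_1, P_2) \;=\; \min_{G,\;\pi^{(1)},\,\pi^{(2)}}\;
  W\!\Big(\sum_{k=1}^{K} \pi^{(1)}_k\, P_{G_k},\; P_1\Big)
  \;+\; W\!\Big(\sum_{k=1}^{K} \pi^{(2)}_k\, P_{G_k},\; P_2\Big)
```

Because both terms compare against the same components P_{G_k} while the proportions adapt independently, two mixtures with identical components but different proportions yield a small normalized distance, unlike the vanilla Wasserstein case.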
Fantastic Four: Differentiable Bounds on Singular Values of Convolution Layers
In deep neural networks, the spectral norm of the Jacobian of a layer bounds
the factor by which the norm of a signal changes during forward/backward
propagation. Spectral norm regularizations have been shown to improve
generalization, robustness and optimization of deep learning methods. Existing
methods to compute the spectral norm of convolution layers either rely on
heuristics that are efficient in computation but lack guarantees or are
theoretically-sound but computationally expensive. In this work, we obtain the
best of both worlds by deriving {\it four} provable upper bounds on the
spectral norm of a standard 2D multi-channel convolution layer. These bounds
are differentiable and can be computed efficiently during training with
negligible overhead. One of these bounds is in fact the popular heuristic
method of Miyato et al. (multiplied by a constant factor depending on filter
sizes). Each of these four bounds can achieve the tightest gap depending on
convolution filters. Thus, we propose to use the minimum of these four bounds
as a tight, differentiable and efficient upper bound on the spectral norm of
convolution layers. We show that our spectral bound is an effective regularizer
and can be used to bound either the Lipschitz constant or curvature values
(eigenvalues of the Hessian) of neural networks. Through experiments on MNIST
and CIFAR-10, we demonstrate the effectiveness of our spectral bound in
improving generalization and provable robustness of deep networks.
Comment: Accepted at ICLR, 202
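As an illustration, here is a PyTorch sketch of the Miyato-style bound mentioned above: power iteration on the reshaped kernel, multiplied by a filter-size constant. We take that constant to be sqrt(h*w) for this sketch; the paper's exact constants and its other three bounds differ.

```python
import torch

def conv_spectral_bound(weight, n_iter=20):
    """Differentiable upper bound on a conv layer's spectral norm (sketch).

    weight: (c_out, c_in, h, w) convolution kernel. Power iteration yields
    the spectral norm of the reshaped kernel (the Miyato et al. heuristic);
    the sqrt(h*w) factor is our assumed filter-size constant.
    """
    c_out, c_in, h, w = weight.shape
    W = weight.reshape(c_out, -1)
    u = torch.randn(c_out, device=weight.device)
    for _ in range(n_iter):
        v = torch.nn.functional.normalize(W.t() @ u, dim=0)
        u = torch.nn.functional.normalize(W @ v, dim=0)
    sigma = u @ (W @ v)           # spectral norm of the reshaped kernel
    return (h * w) ** 0.5 * sigma
```

Since the bound is built from differentiable tensor operations, it can be added directly to the training loss as a regularizer, which is the use case the abstract describes.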
(De)Randomized Smoothing for Certifiable Defense against Patch Attacks
Patch adversarial attacks on images, in which the attacker can distort pixels
within a region of bounded size, are an important threat model since they
provide a quantitative model for physical adversarial attacks. In this paper,
we introduce a certifiable defense against patch attacks that guarantees that,
for a given image and patch attack size, no patch adversarial examples exist. Our
method is related to the broad class of randomized smoothing robustness schemes
which provide high-confidence probabilistic robustness certificates. By
exploiting the fact that patch attacks are more constrained than general sparse
attacks, we derive meaningfully large robustness certificates against them.
Additionally, in contrast to smoothing-based defenses against L_p and sparse
attacks, our defense method against patch attacks is de-randomized, yielding
improved, deterministic certificates. Compared to the existing patch
certification method proposed by Chiang et al. (2020), which relies on interval
bound propagation, our method can be trained significantly faster, achieves
high clean and certified robust accuracy on CIFAR-10, and provides certificates
at ImageNet scale. For example, for a 5-by-5 patch attack on CIFAR-10, our
method achieves up to around 57.6% certified accuracy (with a classifier with
around 83.8% clean accuracy), compared to at most 30.3% certified accuracy for
the existing method (with a classifier with around 47.8% clean accuracy). Our
results effectively establish a new state-of-the-art of certifiable defense
against patch attacks on CIFAR-10 and ImageNet. Code is available at
https://github.com/alevine0/patchSmoothing.
Comment: NeurIPS 202
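A PyTorch sketch of the de-randomized idea with column ablation follows; the band width, the vote-gap condition, and the count of patch-affected bands are our simplified reconstruction, not the paper's exact certificate.

```python
import torch

def certify_column_smoothing(model, x, band=4, patch=5, num_classes=10):
    """Classify every retained column band of width `band`, take a majority
    vote, and certify if the vote gap exceeds twice the number of bands a
    `patch`-wide adversarial patch can intersect (a sketch).

    x: (1, C, H, W) input image; model maps a batch to logits.
    """
    W = x.shape[-1]
    counts = torch.zeros(num_classes)
    for start in range(W):                     # one band per column offset
        mask = torch.zeros_like(x)
        cols = [(start + j) % W for j in range(band)]
        mask[..., cols] = 1.0
        pred = model(x * mask).argmax(dim=1)   # classify the ablated image
        counts[pred.item()] += 1
    top2 = counts.topk(2)
    affected = patch + band - 1                # bands a patch can touch
    certified = (top2.values[0] - top2.values[1]) > 2 * affected
    return top2.indices[0].item(), bool(certified)
```

Because every band is enumerated rather than sampled, the vote counts are exact, which is what makes the resulting certificate deterministic instead of probabilistic.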
Understanding GANs: the LQG Setting
Generative Adversarial Networks (GANs) have become a popular method to learn
a probability model from data. In this paper, we aim to provide an
understanding of some of the basic issues surrounding GANs including their
formulation, generalization and stability on a simple benchmark where the data
has a high-dimensional Gaussian distribution. Even in this simple benchmark,
the GAN problem has not been well understood, as we observe that existing
state-of-the-art GAN architectures may fail to learn a proper generative
distribution owing to (1) stability issues (i.e., convergence to bad local
solutions or not converging at all), (2) approximation issues (i.e., having
improper global GAN optimizers caused by inappropriate GAN loss functions),
and (3) generalizability issues (i.e., requiring a large number of samples for
training). In this setup, we propose a GAN architecture which recovers the
maximum-likelihood solution and demonstrates fast generalization. Moreover, we
analyze global stability of different computational approaches for the proposed
GAN optimization and highlight their pros and cons. Finally, we outline an
extension of our model-based approach to design GANs in more complex setups
than the considered Gaussian benchmark.
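For context, the quadratic-loss optimal transport cost between two Gaussians, the natural distance in this linear-quadratic-Gaussian benchmark, has the standard closed form

```latex
W_2^2\big(\mathcal{N}(m_1,\Sigma_1),\,\mathcal{N}(m_2,\Sigma_2)\big)
  \;=\; \|m_1 - m_2\|^2
  \;+\; \mathrm{tr}\Big(\Sigma_1 + \Sigma_2
  - 2\big(\Sigma_2^{1/2}\,\Sigma_1\,\Sigma_2^{1/2}\big)^{1/2}\Big)
```

which makes the benchmark analytically tractable; that this exact distance underlies the paper's analysis is our inference from the quadratic-Gaussian setting.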
Robustness Certificates Against Adversarial Examples for ReLU Networks
While neural networks have achieved high performance in different learning
tasks, their accuracy drops significantly in the presence of small adversarial
perturbations to inputs. Defenses based on regularization and adversarial
training are often followed by new attacks to defeat them. In this paper, we
propose attack-agnostic robustness certificates for a multi-label
classification problem using a deep ReLU network. Although computing the exact
distance of a given input sample to the classification decision boundary
requires solving a non-convex optimization, we characterize two lower bounds
for such distances, namely the simplex certificate and the decision boundary
certificate. These robustness certificates leverage the piecewise-linear
structure of ReLU networks and use the fact that in a polyhedron around a given
sample, the prediction function is linear. In particular, the proposed simplex
certificate has a closed-form, is differentiable and is an order of magnitude
faster to compute than the existing methods even for deep networks. In addition
to theoretical bounds, we provide numerical results for our certificates over
MNIST and compare them with some existing upper bounds.
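Since the network is linear inside the polyhedron containing the input, a certificate of this flavor can be computed in a few lines; the PyTorch sketch below implements the generic linear-region bound rather than the paper's specific simplex or decision boundary certificates.

```python
import torch

def linear_region_certificate(model, x, label):
    """Lower bound on the distance to the decision boundary within the
    ReLU network's current linear region (a sketch).

    Inside the polyhedron containing x the logits are linear, so the
    distance to the boundary between `label` and class j is
    (z_label - z_j) / ||grad(z_label - z_j)||.
    """
    x = x.clone().requires_grad_(True)
    z = model(x.unsqueeze(0))[0]
    bounds = []
    for j in range(z.numel()):
        if j == label:
            continue
        g, = torch.autograd.grad(z[label] - z[j], x, retain_graph=True)
        bounds.append((z[label] - z[j]).item() / g.norm().item())
    return min(bounds)   # certified radius inside the linear region
```

Note that this radius is only valid while the linear region does not change; the paper's certificates account for this, which this sketch does not.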
Deep Partition Aggregation: Provable Defense against General Poisoning Attacks
Adversarial poisoning attacks distort training data in order to corrupt the
test-time behavior of a classifier. A provable defense provides a certificate
for each test sample, which is a lower bound on the magnitude of any
adversarial distortion of the training set that can corrupt the test sample's
classification. We propose two novel provable defenses against poisoning
attacks: (i) Deep Partition Aggregation (DPA), a certified defense against a
general poisoning threat model, defined as the insertion or deletion of a
bounded number of training samples -- by implication, this threat
model also includes arbitrary distortions to a bounded number of images and/or
labels; and (ii) Semi-Supervised DPA (SS-DPA), a certified defense against
label-flipping poisoning attacks. DPA is an ensemble method where base models
are trained on partitions of the training set determined by a hash function.
DPA is related to both subset aggregation, a well-studied ensemble method in
classical machine learning, as well as to randomized smoothing, a popular
provable defense against evasion attacks. Our defense against label-flipping
attacks, SS-DPA, uses a semi-supervised learning algorithm as its base
classifier model: each base classifier is trained using the entire unlabeled
training set in addition to the labels for a partition. SS-DPA significantly
outperforms the existing certified defense for label-flipping attacks on both
MNIST and CIFAR-10: provably tolerating, for at least half of test images, over
600 label flips (vs. < 200 label flips) on MNIST and over 300 label flips (vs.
175 label flips) on CIFAR-10. Against general poisoning attacks, where no prior
certified defenses exist, DPA can certify >= 50% of test images against over
500 poison image insertions on MNIST, and nine insertions on CIFAR-10. These
results establish new state-of-the-art provable defenses against poisoning
attacks.
Comment: ICLR 202
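A compact Python sketch of DPA's partition-train-vote-certify loop follows; `train_fn` is a hypothetical helper that trains and returns a classifier for one partition, and hashing `repr(sample)` stands in for whatever canonical byte representation of a sample the paper hashes.

```python
import hashlib

def dpa_train_and_certify(train_set, test_x, k, train_fn, num_classes):
    """Deep Partition Aggregation (sketch).

    Certificate: the prediction is unchanged by any insertion/deletion of
    up to floor(gap / 2) training samples, where gap is the vote margin
    adjusted for class-index tie-breaking.
    """
    # Deterministically hash each sample into one of k partitions.
    parts = [[] for _ in range(k)]
    for sample in train_set:
        h = int(hashlib.sha256(repr(sample).encode()).hexdigest(), 16)
        parts[h % k].append(sample)
    models = [train_fn(p) for p in parts]  # one base classifier per partition

    votes = [0] * num_classes
    for m in models:
        votes[m(test_x)] += 1              # m returns a predicted class index
    order = sorted(range(num_classes), key=lambda c: (-votes[c], c))
    top, second = order[0], order[1]
    gap = votes[top] - votes[second] - (1 if second < top else 0)
    return top, gap // 2                   # prediction, certified radius
```

The hash-based partitioning is what makes the certificate work: inserting or deleting one training sample can change at most one partition, and hence at most one vote.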